Evaluation of CASA and BSS models for cocktail-party speech segregation

Authors

  • Frédéric Berthommier
  • Seungjin Choi
Abstract

For speech segregation, a blind separation (BSS) model is tested alongside a computational auditory scene analysis (CASA) model, which is based on the localisation cue and the evaluation of the time delay of arrival (TDOA). The test database is composed of 332 binary-mixture sentences recorded in stereo with a static set-up. These are truncated to 1 second for the simulations. To apply the two models, we split the frequency domain into a variable number of subbands, which are processed independently. We then evaluate the gain using reference signals recorded in isolation. A coherence index, which measures the degree of convergence and does not require this reference, is also established for the BSS model. After a careful analysis, we find gains of about 1-3 dB for the two methods, which are smaller than those published for the same task. Varying the number of subbands allows an optimisation: we obtain a significant peak at 4 subbands for the CASA model and a smaller maximum at 2 subbands for the BSS model.
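As a rough illustration of two quantities the abstract mentions, the sketch below estimates a TDOA from the peak of the cross-correlation between two channels and computes a gain in dB against a reference recorded in isolation. It is not the authors' implementation. The abstract does not define the gain, so the definition used here (output SNR minus input SNR) is an assumption, as are the function names, the full-band rather than per-subband processing, and the `fs` and `max_lag` parameters.

```python
import numpy as np


def estimate_tdoa(left, right, fs, max_lag):
    """Delay of `right` relative to `left`, in seconds.

    Assumption: both channels have the same length and the delay is
    at most `max_lag` samples in either direction.
    """
    n = len(left)
    lags = np.arange(-max_lag, max_lag + 1)
    # Cross-correlation at lag k: sum over n of left[n] * right[n + k].
    xcorr = [
        np.dot(left[max(0, -k):n - max(0, k)], right[max(0, k):n - max(0, -k)])
        for k in lags
    ]
    return lags[int(np.argmax(xcorr))] / fs


def snr_db(reference, signal):
    """SNR of `signal` with respect to the isolated `reference`, in dB."""
    noise = signal - reference
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2))


def gain_db(reference, mixture, separated):
    """Assumed gain definition: output SNR minus input SNR, in dB."""
    return snr_db(reference, separated) - snr_db(reference, mixture)
```

To follow the subband scheme described in the abstract, one would apply such functions to each band-filtered pair of channels independently. The choice of filterbank is left open here because the abstract does not specify it.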

Similar resources

Evaluation of CASA and BSS models for subband cocktail-party speech separation

For speech segregation, a recurrent blind separation model (BSS) is tested together with a CASA model, which is based on the localisation cue and the evaluation of the time delay of arrival (TDOA). The test database is composed of 332 binary mixture sentences recorded in stereo with a static set-up. These are truncated at 1 second for the simulations. For applying the two models, we cut the fre...

Comparative evaluation of CASA and BSS models for subband cocktail-party speech separation

For speech segregation, a blind separation model (BSS) is tested together with a CASA model which is based on the localisation cue and the evaluation of the time delay of arrival (TDOA). The test database is composed of 332 binary mixture sentences recorded in stereo with a static set-up. These are truncated at 1 second for the simulations. For applying the two models, we cut the frequency doma...

Comparative evaluation of CA for subband cocktail-party

For speech segregation, a recurrent blind separation model (BSS) is tested together with a Computational Auditory Scene Analysis (CASA) model, which is based on the localisation cue and the evaluation of the Time Delay Of Arrival (TDOA). The test database is composed of 332 binary mixture sentences recorded in stereo with a static set-up. These are truncated at 1 second for the simulations. For...

A CASA Front-end Using the Localisation Cue for Segregation and Then Cocktail-party Speech Recognition

We propose and test a cocktail-party recognition technique based on segregation applied before recognition. This CASA front-end uses the TDOA (Time Delay Of Arrival) evaluated within subbands in order to determine the Relative Level (RL) of two competing speech sources. To perform the evaluation of the model, we have recorded a stereo database ST-NB95 from the mono Numbers95. This is composed o...

Cocktail Party Processing

Speech segregation, or the cocktail party problem, has proven to be extremely challenging. This presentation describes a computational auditory scene analysis (CASA) approach to the cocktail party problem. This approach performs auditory segmentation and grouping in a two-dimensional time-frequency representation that encodes proximity in frequency and time, periodicity, amplitude modulation, a...

Publication date: 2007